Overview

Dataset statistics

Number of variables: 15
Number of observations: 1599828
Missing cells: 199
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 227.5 MiB
Average record size in memory: 149.1 B
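
The derived figures in this table can be cross-checked from the raw counts. The sketch below is only a sanity check on the numbers reported above; the variable names are illustrative and not part of the report.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 15
n_observations = 1_599_828
missing_cells = 199
total_size_mib = 227.5

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The missing share is far below the 0.1% display floor, which is why the report shows "< 0.1%".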

Variable types

Numeric: 7
DateTime: 2
Categorical: 5
Text: 1

Alerts

Venda has constant value "1" [Constant]
affiliate_id is highly overall correlated with producer_id [High correlation]
producer_id is highly overall correlated with affiliate_id [High correlation]
product_category is highly imbalanced (75.9%) [Imbalance]
is_origin_page_social_network is highly imbalanced (84.2%) [Imbalance]
purchase_id has unique values [Unique]
affiliate_commission_percentual has 1338727 (83.7%) zeros [Zeros]
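
The report does not state how the imbalance percentages are computed. One definition that reproduces both figures from the category counts listed later in this report is one minus the normalized Shannon entropy of the class shares. The sketch below assumes that definition; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Counts copied from the "Common Values" tables of each variable.
product_category_counts = [1334610, 216720, 38246, 3965, 1970, 1561, 1557, 847, 295, 57]
social_network_counts = [1562977, 36851]

category_score = imbalance_score(product_category_counts)
social_score = imbalance_score(social_network_counts)
print(f"product_category: {category_score:.3f}")
print(f"is_origin_page_social_network: {social_score:.3f}")
```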

Reproduction

Analysis started: 2023-06-30 13:38:42.339448
Analysis finished: 2023-06-30 13:40:10.240381
Duration: 1 minute and 27.9 seconds
Software version: ydata-profiling v4.3.1
Download configuration: config.json
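
As a quick consistency check, the reported duration follows from the two timestamps. This is a minimal sketch using only the standard library.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2023-06-30 13:38:42.339448")
finished = datetime.fromisoformat("2023-06-30 13:40:10.240381")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.1f} seconds")
```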

Variables

purchase_id
Real number (ℝ)

UNIQUE 

Distinct: 1599828
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12445457
Minimum: 1663958
Maximum: 14357203
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 56.7 MiB

Quantile statistics

Minimum: 1663958
5-th percentile: 11000520
Q1: 11653743
Median: 12468488
Q3: 13233099
95-th percentile: 13852541
Maximum: 14357203
Range: 12693245
Interquartile range (IQR): 1579356.5

Descriptive statistics

Standard deviation: 917582.02
Coefficient of variation (CV): 0.073728273
Kurtosis: -0.75565529
Mean: 12445457
Median Absolute Deviation (MAD): 789763
Skewness: -0.090470385
Sum: 1.991059 × 10^13
Variance: 8.4195677 × 10^11
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1663958 | 1 | < 0.1%
12982115 | 1 | < 0.1%
12982152 | 1 | < 0.1%
12982151 | 1 | < 0.1%
12982150 | 1 | < 0.1%
12982149 | 1 | < 0.1%
12982148 | 1 | < 0.1%
12982147 | 1 | < 0.1%
12982146 | 1 | < 0.1%
12982143 | 1 | < 0.1%
Other values (1599818) | 1599818 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
1663958 | 1 | < 0.1%
1677087 | 1 | < 0.1%
2017360 | 1 | < 0.1%
2017379 | 1 | < 0.1%
2017382 | 1 | < 0.1%
2017387 | 1 | < 0.1%
2017442 | 1 | < 0.1%
2017522 | 1 | < 0.1%
2017569 | 1 | < 0.1%
2017571 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
14357203 | 1 | < 0.1%
14344113 | 1 | < 0.1%
14343996 | 1 | < 0.1%
14012431 | 1 | < 0.1%
14011995 | 1 | < 0.1%
14011994 | 1 | < 0.1%
14011993 | 1 | < 0.1%
14011992 | 1 | < 0.1%
14011991 | 1 | < 0.1%
14011988 | 1 | < 0.1%
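
Several purchase_id statistics are derived from the others, which gives a cheap way to check that the numbers were read correctly. The sketch below recomputes the range, the coefficient of variation (standard deviation divided by mean) and the standard deviation from the variance. Inputs are the rounded values shown in the report, so agreement is only expected to within display precision.

```python
import math

# Values copied from the purchase_id statistics.
minimum = 1663958
maximum = 14357203
mean = 12445457
std = 917582.02
variance = 8.4195677e11

value_range = maximum - minimum       # Range = Maximum - Minimum
cv = std / mean                       # Coefficient of variation
std_from_variance = math.sqrt(variance)

print(f"range: {value_range}")
print(f"CV: {cv:.9f}")
print(f"sqrt(variance): {std_from_variance:.2f}")
```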

product_id
Real number (ℝ)

Distinct: 17883
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 148595.81
Minimum: 4
Maximum: 319129
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 56.7 MiB

Quantile statistics

Minimum: 4
5-th percentile: 42903
Q1: 112138
Median: 154310
Q3: 193934
95-th percentile: 221931
Maximum: 319129
Range: 319125
Interquartile range (IQR): 81796

Descriptive statistics

Standard deviation: 55543.17
Coefficient of variation (CV): 0.37378691
Kurtosis: -0.70165627
Mean: 148595.81
Median Absolute Deviation (MAD): 40493
Skewness: -0.48236628
Sum: 2.3772774 × 10^11
Variance: 3.0850437 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
219755 | 41220 | 2.6%
130294 | 32731 | 2.0%
42903 | 27228 | 1.7%
63718 | 24132 | 1.5%
132809 | 23350 | 1.5%
83377 | 21601 | 1.4%
149048 | 16386 | 1.0%
59205 | 16096 | 1.0%
154310 | 14455 | 0.9%
132454 | 11685 | 0.7%
Other values (17873) | 1370944 | 85.7%

Minimum 10 values

Value | Count | Frequency (%)
4 | 2 | < 0.1%
17 | 1 | < 0.1%
27 | 2 | < 0.1%
35 | 3 | < 0.1%
39 | 3 | < 0.1%
78 | 9 | < 0.1%
147 | 8 | < 0.1%
148 | 1 | < 0.1%
153 | 3 | < 0.1%
154 | 70 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
319129 | 2 | < 0.1%
241977 | 1 | < 0.1%
241903 | 1 | < 0.1%
241896 | 5 | < 0.1%
241745 | 1 | < 0.1%
241639 | 8 | < 0.1%
241490 | 9 | < 0.1%
241299 | 1 | < 0.1%
241145 | 73 | < 0.1%
240950 | 38 | < 0.1%

affiliate_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22947
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2297500.7
Minimum: 3
Maximum: 7700836
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 56.7 MiB

Quantile statistics

Minimum: 3
5-th percentile: 21813
Q1: 442241
Median: 1690428
Q3: 3992235
95-th percentile: 6563256
Maximum: 7700836
Range: 7700833
Interquartile range (IQR): 3549994

Descriptive statistics

Standard deviation: 2092656.2
Coefficient of variation (CV): 0.91084027
Kurtosis: -0.82260703
Mean: 2297500.7
Median Absolute Deviation (MAD): 1489087
Skewness: 0.65130346
Sum: 3.6756059 × 10^12
Variance: 4.3792098 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
6697083 | 41220 | 2.6%
349701 | 31969 | 2.0%
166090 | 24722 | 1.5%
3992235 | 21437 | 1.3%
464846 | 17764 | 1.1%
442241 | 15362 | 1.0%
4580574 | 15085 | 0.9%
1745680 | 14100 | 0.9%
42346 | 13342 | 0.8%
3512 | 12219 | 0.8%
Other values (22937) | 1392608 | 87.0%

Minimum 10 values

Value | Count | Frequency (%)
3 | 99 | < 0.1%
59 | 4 | < 0.1%
60 | 1599 | 0.1%
62 | 151 | < 0.1%
80 | 498 | < 0.1%
83 | 1 | < 0.1%
93 | 2 | < 0.1%
111 | 11 | < 0.1%
119 | 1 | < 0.1%
120 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
7700836 | 1 | < 0.1%
7689657 | 1 | < 0.1%
7683014 | 1 | < 0.1%
7678978 | 1 | < 0.1%
7678701 | 1 | < 0.1%
7678099 | 1 | < 0.1%
7676162 | 1 | < 0.1%
7674668 | 229 | < 0.1%
7671976 | 1 | < 0.1%
7670937 | 11 | < 0.1%

producer_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8020
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2164479.5
Minimum: 3
Maximum: 9868481
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 56.7 MiB

Quantile statistics

Minimum: 3
5-th percentile: 13050
Q1: 409590
Median: 1377289
Q3: 3776238
95-th percentile: 6062576
Maximum: 9868481
Range: 9868478
Interquartile range (IQR): 3366648

Descriptive statistics

Standard deviation: 2038960.4
Coefficient of variation (CV): 0.94200957
Kurtosis: -0.69928013
Mean: 2164479.5
Median Absolute Deviation (MAD): 1211199
Skewness: 0.7239015
Sum: 3.4627949 × 10^12
Variance: 4.1573596 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
6697083 | 41220 | 2.6%
3992235 | 39331 | 2.5%
464846 | 35470 | 2.2%
349701 | 34568 | 2.2%
166090 | 28895 | 1.8%
442241 | 27798 | 1.7%
2307584 | 21720 | 1.4%
3382787 | 20199 | 1.3%
4580574 | 16386 | 1.0%
671256 | 16096 | 1.0%
Other values (8010) | 1318145 | 82.4%

Minimum 10 values

Value | Count | Frequency (%)
3 | 99 | < 0.1%
59 | 10 | < 0.1%
60 | 1639 | 0.1%
80 | 530 | < 0.1%
93 | 2 | < 0.1%
111 | 23 | < 0.1%
119 | 3 | < 0.1%
131 | 105 | < 0.1%
147 | 1 | < 0.1%
183 | 11 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
9868481 | 2 | < 0.1%
7631606 | 1 | < 0.1%
7630136 | 1 | < 0.1%
7626841 | 2 | < 0.1%
7623443 | 5 | < 0.1%
7614592 | 4 | < 0.1%
7612731 | 152 | < 0.1%
7610636 | 1 | < 0.1%
7606265 | 1 | < 0.1%
7598719 | 1 | < 0.1%

buyer_id
Real number (ℝ)

Distinct: 1100649
Distinct (%): 68.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5187551.3
Minimum: 60
Maximum: 12014792
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 56.7 MiB

Quantile statistics

Minimum: 60
5-th percentile: 703532
Q1: 3730213
Median: 5999153.5
Q3: 6946337.2
95-th percentile: 7521357.7
Maximum: 12014792
Range: 12014732
Interquartile range (IQR): 3216124.2

Descriptive statistics

Standard deviation: 2199256.6
Coefficient of variation (CV): 0.42394888
Kurtosis: -0.49247361
Mean: 5187551.3
Median Absolute Deviation (MAD): 1197454.5
Skewness: -0.87795167
Sum: 8.2991899 × 10^12
Variance: 4.8367294 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
4763605 | 66 | < 0.1%
158226 | 62 | < 0.1%
701523 | 62 | < 0.1%
336781 | 59 | < 0.1%
1128526 | 51 | < 0.1%
137316 | 44 | < 0.1%
37734 | 43 | < 0.1%
2196492 | 43 | < 0.1%
32277 | 42 | < 0.1%
292117 | 41 | < 0.1%
Other values (1100639) | 1599315 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
60 | 2 | < 0.1%
65 | 6 | < 0.1%
80 | 1 | < 0.1%
91 | 4 | < 0.1%
115 | 1 | < 0.1%
120 | 1 | < 0.1%
136 | 5 | < 0.1%
139 | 1 | < 0.1%
169 | 1 | < 0.1%
202 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
12014792 | 1 | < 0.1%
11866607 | 1 | < 0.1%
11861782 | 1 | < 0.1%
11828705 | 1 | < 0.1%
11820644 | 1 | < 0.1%
11802198 | 6 | < 0.1%
11795287 | 1 | < 0.1%
11688577 | 2 | < 0.1%
11639998 | 1 | < 0.1%
11625126 | 1 | < 0.1%

purchase_date
DateTime

Distinct: 1488964
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB
Minimum: 2016-01-01 00:00:27
Maximum: 2016-06-30 23:59:57
Histogram with fixed size bins (bins=50)

product_creation_date
DateTime

Distinct: 17879
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB
Minimum: 2008-10-27 01:39:34
Maximum: 2016-12-31 13:43:50
Histogram with fixed size bins (bins=50)

product_category
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB

Value | Count
Phisical book | 1334610
Podcast | 216720
Workshop | 38246
eBook | 3965
Subscription | 1970
Other values (5) | 4317

Length

Max length: 15
Median length: 13
Mean length: 12.034274
Min length: 3

Characters and Unicode

Total characters: 19252768
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Video
2nd row: Podcast
3rd row: Podcast
4th row: Podcast
5th row: Podcast

Common Values

Value | Count | Frequency (%)
Phisical book | 1334610 | 83.4%
Podcast | 216720 | 13.5%
Workshop | 38246 | 2.4%
eBook | 3965 | 0.2%
Subscription | 1970 | 0.1%
In-class course | 1561 | 0.1%
App | 1557 | 0.1%
eTicket | 847 | 0.1%
Webinar | 295 | < 0.1%
Video | 57 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
phisical | 1334610 | 45.5%
book | 1334610 | 45.5%
podcast | 216720 | 7.4%
workshop | 38246 | 1.3%
ebook | 3965 | 0.1%
subscription | 1970 | 0.1%
in-class | 1561 | 0.1%
course | 1561 | 0.1%
app | 1557 | 0.1%
eticket | 847 | < 0.1%
Other values (2) | 352 | < 0.1%
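
The word-level table can be reproduced from the category counts. The sketch below assumes the tokenization is lowercasing followed by a whitespace split; that is an inference from the table (for example "in-class" stays one token), not something the report states.

```python
from collections import Counter

# Category counts copied from the "Common Values" table for product_category.
category_counts = {
    "Phisical book": 1334610,
    "Podcast": 216720,
    "Workshop": 38246,
    "eBook": 3965,
    "Subscription": 1970,
    "In-class course": 1561,
    "App": 1557,
    "eTicket": 847,
    "Webinar": 295,
    "Video": 57,
}

# Each word inherits the count of every category value it appears in.
word_counts = Counter()
for value, count in category_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common(3):
    print(f"{word}: {count} ({100 * count / total_words:.1f}%)")
```

Because two-word categories contribute two tokens per row, word frequencies are shares of all tokens, not of rows. That is why "phisical" shows 45.5% here while "Phisical book" covers 83.4% of rows.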

Most occurring characters

Value | Count | Frequency (%)
o | 2973950 | 15.4%
i | 2674359 | 13.9%
s | 1596229 | 8.3%
c | 1557269 | 8.1%
a | 1553186 | 8.1%
P | 1551330 | 8.1%
k | 1377668 | 7.2%
h | 1372856 | 7.1%
b | 1336875 | 6.9%
l | 1336171 | 6.9%
Other values (16) | 1922875 | 10.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 16315208 | 84.7%
Uppercase Letter | 1599828 | 8.3%
Space Separator | 1336171 | 6.9%
Dash Punctuation | 1561 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 2973950 | 18.2%
i | 2674359 | 16.4%
s | 1596229 | 9.8%
c | 1557269 | 9.5%
a | 1553186 | 9.5%
k | 1377668 | 8.4%
h | 1372856 | 8.4%
b | 1336875 | 8.2%
l | 1336171 | 8.2%
t | 219537 | 1.3%
Other values (6) | 317108 | 1.9%
Uppercase Letter
Value | Count | Frequency (%)
P | 1551330 | 97.0%
W | 38541 | 2.4%
B | 3965 | 0.2%
S | 1970 | 0.1%
I | 1561 | 0.1%
A | 1557 | 0.1%
T | 847 | 0.1%
V | 57 | < 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 1336171 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1561 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 17915036 | 93.1%
Common | 1337732 | 6.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 2973950 | 16.6%
i | 2674359 | 14.9%
s | 1596229 | 8.9%
c | 1557269 | 8.7%
a | 1553186 | 8.7%
P | 1551330 | 8.7%
k | 1377668 | 7.7%
h | 1372856 | 7.7%
b | 1336875 | 7.5%
l | 1336171 | 7.5%
Other values (14) | 585143 | 3.3%
Common
Value | Count | Frequency (%)
(space) | 1336171 | 99.9%
- | 1561 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 19252768 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 2973950 | 15.4%
i | 2674359 | 13.9%
s | 1596229 | 8.3%
c | 1557269 | 8.1%
a | 1553186 | 8.1%
P | 1551330 | 8.1%
k | 1377668 | 7.2%
h | 1372856 | 7.1%
b | 1336875 | 6.9%
l | 1336171 | 6.9%
Other values (16) | 1922875 | 10.0%

product_niche
Categorical

Distinct: 25
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB

Value | Count
Negotiation | 279921
Anxiety management | 251675
Personal finance | 186849
Presentation skills | 140460
Immigration | 95660
Other values (20) | 645263

Length

Max length: 22
Median length: 18
Mean length: 14.304792
Min length: 7

Characters and Unicode

Total characters: 22885206
Distinct characters: 39
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Presentation skills
2nd row: Child psychology
3rd row: Presentation skills
4th row: Anxiety management
5th row: Teaching English

Common Values

Value | Count | Frequency (%)
Negotiation | 279921 | 17.5%
Anxiety management | 251675 | 15.7%
Personal finance | 186849 | 11.7%
Presentation skills | 140460 | 8.8%
Immigration | 95660 | 6.0%
Government | 93365 | 5.8%
YouTube video creation | 80206 | 5.0%
Online course creation | 60431 | 3.8%
Careers | 52929 | 3.3%
Organization | 48632 | 3.0%
Other values (15) | 309700 | 19.4%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
negotiation | 279921 | 11.0%
management | 251675 | 9.9%
anxiety | 251675 | 9.9%
personal | 186849 | 7.3%
finance | 186849 | 7.3%
creation | 140637 | 5.5%
presentation | 140460 | 5.5%
skills | 140460 | 5.5%
immigration | 95660 | 3.7%
government | 93365 | 3.7%
Other values (26) | 784042 | 30.7%

Most occurring characters

Value | Count | Frequency (%)
n | 2887356 | 12.6%
e | 2499340 | 10.9%
i | 2312161 | 10.1%
a | 1910351 | 8.3%
t | 1906926 | 8.3%
o | 1833993 | 8.0%
r | 993746 | 4.3%
(space) | 951765 | 4.2%
s | 895967 | 3.9%
g | 887141 | 3.9%
Other values (29) | 5806460 | 25.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 20227400 | 88.4%
Uppercase Letter | 1706041 | 7.5%
Space Separator | 951765 | 4.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 2887356 | 14.3%
e | 2499340 | 12.4%
i | 2312161 | 11.4%
a | 1910351 | 9.4%
t | 1906926 | 9.4%
o | 1833993 | 9.1%
r | 993746 | 4.9%
s | 895967 | 4.4%
g | 887141 | 4.4%
m | 885985 | 4.4%
Other values (13) | 3214434 | 15.9%
Uppercase Letter
Value | Count | Frequency (%)
P | 382634 | 22.4%
A | 297376 | 17.4%
N | 279921 | 16.4%
G | 117400 | 6.9%
O | 109063 | 6.4%
T | 106767 | 6.3%
I | 95660 | 5.6%
Y | 80206 | 4.7%
C | 71848 | 4.2%
E | 66378 | 3.9%
Other values (5) | 98788 | 5.8%
Space Separator
Value | Count | Frequency (%)
(space) | 951765 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 21933441 | 95.8%
Common | 951765 | 4.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 2887356 | 13.2%
e | 2499340 | 11.4%
i | 2312161 | 10.5%
a | 1910351 | 8.7%
t | 1906926 | 8.7%
o | 1833993 | 8.4%
r | 993746 | 4.5%
s | 895967 | 4.1%
g | 887141 | 4.0%
m | 885985 | 4.0%
Other values (28) | 4920475 | 22.4%
Common
Value | Count | Frequency (%)
(space) | 951765 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 22885206 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 2887356 | 12.6%
e | 2499340 | 10.9%
i | 2312161 | 10.1%
a | 1910351 | 8.3%
t | 1906926 | 8.3%
o | 1833993 | 8.0%
r | 993746 | 4.3%
(space) | 951765 | 4.2%
s | 895967 | 3.9%
g | 887141 | 3.9%
Other values (29) | 5806460 | 25.4%

purchase_value
Real number (ℝ)

Distinct: 32617
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2484898 × 10^-8
Minimum: -0.54124
Maximum: 124.561
Zeros: 0
Zeros (%): 0.0%
Negative: 1139370
Negative (%): 71.2%
Memory size: 56.7 MiB

Quantile statistics

Minimum: -0.54124
5-th percentile: -0.522656
Q1: -0.453015
Median: -0.349998
Q3: 0.0649229
95-th percentile: 1.64689
Maximum: 124.561
Range: 125.10224
Interquartile range (IQR): 0.5179379

Descriptive statistics

Standard deviation: 1.0000002
Coefficient of variation (CV): 80096785
Kurtosis: 629.20593
Mean: 1.2484898 × 10^-8
Median Absolute Deviation (MAD): 0.153887
Skewness: 10.816902
Sum: 0.01997369
Variance: 1.0000005
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
-0.359158 | 69234 | 4.3%
-0.453015 | 61353 | 3.8%
-0.171445 | 45098 | 2.8%
-0.522656 | 34339 | 2.1%
-0.503885 | 31144 | 1.9%
-0.485114 | 30408 | 1.9%
-0.415472 | 26542 | 1.7%
-0.447571 | 24290 | 1.5%
-0.302844 | 23694 | 1.5%
0.0162677 | 20111 | 1.3%
Other values (32607) | 1233615 | 77.1%

Minimum 10 values

Value | Count | Frequency (%)
-0.54124 | 5548 | 0.3%
-0.541221 | 1 | < 0.1%
-0.541052 | 6 | < 0.1%
-0.54094 | 1 | < 0.1%
-0.540902 | 4 | < 0.1%
-0.540883 | 17 | < 0.1%
-0.540864 | 1 | < 0.1%
-0.540301 | 16 | < 0.1%
-0.540039 | 1 | < 0.1%
-0.53987 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
124.561 | 1 | < 0.1%
103.519 | 1 | < 0.1%
100.018 | 1 | < 0.1%
99.9923 | 1 | < 0.1%
98.3913 | 1 | < 0.1%
98.1909 | 1 | < 0.1%
94.0676 | 1 | < 0.1%
82.1413 | 1 | < 0.1%
71.3344 | 1 | < 0.1%
48.2641 | 1 | < 0.1%
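
A mean of about 0 and a standard deviation of about 1 suggest that purchase_value was standardized (z-scored) before profiling. That is an inference from the statistics above, not something the report states. The sketch below shows what standardization does on a hypothetical toy sample, and why the reported coefficient of variation is so large: CV divides by a mean that is nearly zero.

```python
import statistics


def standardize(values):
    """Return z-scores: subtract the mean, divide by the population std."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical raw prices with one large outlier.
toy_prices = [10.0, 12.0, 15.0, 400.0]
z_scores = standardize(toy_prices)
print([round(z, 3) for z in z_scores])

# Values copied from the purchase_value statistics.
reported_mean = 1.2484898e-8
reported_std = 1.0000002
cv = reported_std / reported_mean
print(f"CV: {cv:.0f}")
```

For a standardized variable the CV carries no information, so the median, IQR and MAD are the more useful spread measures in this section.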

affiliate_commission_percentual
Real number (ℝ)

ZEROS 

Distinct: 279
Distinct (%): < 0.1%
Missing: 199
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.596246
Minimum: 0
Maximum: 100
Zeros: 1338727
Zeros (%): 83.7%
Negative: 0
Negative (%): 0.0%
Memory size: 56.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 55
Maximum: 100
Range: 100
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 18.476731
Coefficient of variation (CV): 2.4323502
Kurtosis: 3.7528109
Mean: 7.596246
Median Absolute Deviation (MAD): 0
Skewness: 2.2590904
Sum: 12151175
Variance: 341.38958
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 1338727 | 83.7%
50 | 70931 | 4.4%
60 | 33794 | 2.1%
30 | 20419 | 1.3%
40 | 19606 | 1.2%
55 | 16135 | 1.0%
80 | 10544 | 0.7%
36.14 | 8200 | 0.5%
25 | 6285 | 0.4%
20 | 5603 | 0.4%
Other values (269) | 69385 | 4.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1338727 | 83.7%
0.01 | 539 | < 0.1%
0.1 | 99 | < 0.1%
0.35 | 1 | < 0.1%
1 | 5025 | 0.3%
1.5 | 7 | < 0.1%
1.58 | 2 | < 0.1%
2 | 253 | < 0.1%
2.5 | 8 | < 0.1%
3 | 8 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
100 | 10 | < 0.1%
99 | 929 | 0.1%
95 | 13 | < 0.1%
90 | 12 | < 0.1%
89.53 | 96 | < 0.1%
87.19 | 12 | < 0.1%
86 | 3 | < 0.1%
80 | 10544 | 0.7%
79 | 67 | < 0.1%
77 | 23 | < 0.1%
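
This is the only variable with missing values, so it is worth checking which denominator the statistics use. The arithmetic below, based on the figures reported above, indicates that the mean is taken over non-missing rows while the zero percentage is taken over all rows. This is a sanity check on the reported numbers, not a description of the profiler's internals.

```python
# Values copied from the dataset overview and this variable's statistics.
n_observations = 1_599_828
n_missing = 199
n_zeros = 1_338_727
value_sum = 12_151_175  # rounded as displayed in the report

n_present = n_observations - n_missing
mean_over_present = value_sum / n_present
mean_over_all = value_sum / n_observations
zeros_pct = 100 * n_zeros / n_observations

print(f"mean over non-missing rows: {mean_over_present:.6f}")
print(f"mean over all rows:         {mean_over_all:.6f}")
print(f"zeros: {zeros_pct:.1f}%")
```

Only the first mean matches the reported 7.596246, which is consistent with missing values being skipped rather than counted as zero.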

purchase_device
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB

Value | Count
eReaders | 664197
Desktop | 581900
Smart TV | 330073
Cellphone | 20708
Tablet | 2950

Length

Max length: 9
Median length: 8
Mean length: 7.6455294
Min length: 6

Characters and Unicode

Total characters: 12231532
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Smart TV
2nd row: Smart TV
3rd row: Smart TV
4th row: Smart TV
5th row: Smart TV

Common Values

Value | Count | Frequency (%)
eReaders | 664197 | 41.5%
Desktop | 581900 | 36.4%
Smart TV | 330073 | 20.6%
Cellphone | 20708 | 1.3%
Tablet | 2950 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
ereaders | 664197 | 34.4%
desktop | 581900 | 30.2%
smart | 330073 | 17.1%
tv | 330073 | 17.1%
cellphone | 20708 | 1.1%
tablet | 2950 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
e | 2618857 | 21.4%
s | 1246097 | 10.2%
a | 997220 | 8.2%
r | 994270 | 8.1%
t | 914923 | 7.5%
d | 664197 | 5.4%
R | 664197 | 5.4%
o | 602608 | 4.9%
p | 602608 | 4.9%
k | 581900 | 4.8%
Other values (11) | 2344655 | 19.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9641485 | 78.8%
Uppercase Letter | 2259974 | 18.5%
Space Separator | 330073 | 2.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 2618857 | 27.2%
s | 1246097 | 12.9%
a | 997220 | 10.3%
r | 994270 | 10.3%
t | 914923 | 9.5%
d | 664197 | 6.9%
o | 602608 | 6.3%
p | 602608 | 6.3%
k | 581900 | 6.0%
m | 330073 | 3.4%
Other values (4) | 88732 | 0.9%
Uppercase Letter
Value | Count | Frequency (%)
R | 664197 | 29.4%
D | 581900 | 25.7%
T | 333023 | 14.7%
S | 330073 | 14.6%
V | 330073 | 14.6%
C | 20708 | 0.9%
Space Separator
Value | Count | Frequency (%)
(space) | 330073 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11901459 | 97.3%
Common | 330073 | 2.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 2618857 | 22.0%
s | 1246097 | 10.5%
a | 997220 | 8.4%
r | 994270 | 8.4%
t | 914923 | 7.7%
d | 664197 | 5.6%
R | 664197 | 5.6%
o | 602608 | 5.1%
p | 602608 | 5.1%
k | 581900 | 4.9%
Other values (10) | 2014582 | 16.9%
Common
Value | Count | Frequency (%)
(space) | 330073 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12231532 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 2618857 | 21.4%
s | 1246097 | 10.2%
a | 997220 | 8.2%
r | 994270 | 8.1%
t | 914923 | 7.5%
d | 664197 | 5.4%
R | 664197 | 5.4%
o | 602608 | 4.9%
p | 602608 | 4.9%
k | 581900 | 4.8%
Other values (11) | 2344655 | 19.2%

purchase_origin
Text

Distinct: 9603
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 17598108
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2002
Unique (%): 0.1%

Sample

1st row: Origin ef2b
2nd row: Origin ef2b
3rd row: Origin ef2b
4th row: Origin ef2b
5th row: Origin ef2b

Words

Value | Count | Frequency (%)
origin | 1599828 | 50.0%
ef2b | 330077 | 10.3%
5187 | 167028 | 5.2%
adf0 | 77857 | 2.4%
18eb | 28693 | 0.9%
3ade | 13069 | 0.4%
cf02 | 12066 | 0.4%
d8b2 | 11033 | 0.3%
cd46 | 10723 | 0.3%
a144 | 10049 | 0.3%
Other values (9594) | 939233 | 29.4%

Most occurring characters

Value | Count | Frequency (%)
i | 3199656 | 18.2%
O | 1599828 | 9.1%
r | 1599828 | 9.1%
g | 1599828 | 9.1%
n | 1599828 | 9.1%
(space) | 1599828 | 9.1%
f | 674319 | 3.8%
2 | 615308 | 3.5%
e | 613920 | 3.5%
b | 592814 | 3.4%
Other values (12) | 3902951 | 22.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 10775042 | 61.2%
Decimal Number | 3623410 | 20.6%
Uppercase Letter | 1599828 | 9.1%
Space Separator | 1599828 | 9.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 3199656 | 29.7%
r | 1599828 | 14.8%
g | 1599828 | 14.8%
n | 1599828 | 14.8%
f | 674319 | 6.3%
e | 613920 | 5.7%
b | 592814 | 5.5%
a | 338317 | 3.1%
d | 336136 | 3.1%
c | 220396 | 2.0%
Decimal Number
Value | Count | Frequency (%)
2 | 615308 | 17.0%
1 | 444549 | 12.3%
7 | 422736 | 11.7%
5 | 417900 | 11.5%
8 | 416881 | 11.5%
0 | 306486 | 8.5%
4 | 268796 | 7.4%
6 | 254643 | 7.0%
3 | 249862 | 6.9%
9 | 226249 | 6.2%
Uppercase Letter
Value | Count | Frequency (%)
O | 1599828 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1599828 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 12374870 | 70.3%
Common | 5223238 | 29.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 3199656 | 25.9%
O | 1599828 | 12.9%
r | 1599828 | 12.9%
g | 1599828 | 12.9%
n | 1599828 | 12.9%
f | 674319 | 5.4%
e | 613920 | 5.0%
b | 592814 | 4.8%
a | 338317 | 2.7%
d | 336136 | 2.7%
Common
Value | Count | Frequency (%)
(space) | 1599828 | 30.6%
2 | 615308 | 11.8%
1 | 444549 | 8.5%
7 | 422736 | 8.1%
5 | 417900 | 8.0%
8 | 416881 | 8.0%
0 | 306486 | 5.9%
4 | 268796 | 5.1%
6 | 254643 | 4.9%
3 | 249862 | 4.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 17598108 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 3199656 | 18.2%
O | 1599828 | 9.1%
r | 1599828 | 9.1%
g | 1599828 | 9.1%
n | 1599828 | 9.1%
(space) | 1599828 | 9.1%
f | 674319 | 3.8%
2 | 615308 | 3.5%
e | 613920 | 3.5%
b | 592814 | 3.4%
Other values (12) | 3902951 | 22.2%

is_origin_page_social_network
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB

Value | Count
0,0 | 1562977
1,0 | 36851

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 4799484
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0,0
2nd row: 0,0
3rd row: 0,0
4th row: 0,0
5th row: 0,0

Common Values

Value | Count | Frequency (%)
0,0 | 1562977 | 97.7%
1,0 | 36851 | 2.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0,0 | 1562977 | 97.7%
1,0 | 36851 | 2.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 3162805 | 65.9%
, | 1599828 | 33.3%
1 | 36851 | 0.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3199656 | 66.7%
Other Punctuation | 1599828 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 3162805 | 98.8%
1 | 36851 | 1.2%
Other Punctuation
Value | Count | Frequency (%)
, | 1599828 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4799484 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 3162805 | 65.9%
, | 1599828 | 33.3%
1 | 36851 | 0.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4799484 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 3162805 | 65.9%
, | 1599828 | 33.3%
1 | 36851 | 0.8%

Venda
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.7 MiB

Value | Count
1 | 1599828

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1599828
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1599828 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
1 | 1599828 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 1599828 | 100.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1599828 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 1599828 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1599828 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 1599828 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1599828 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 1599828 | 100.0%

Interactions


Correlations

Variable | purchase_id | product_id | affiliate_id | producer_id | buyer_id | purchase_value | affiliate_commission_percentual | product_category | product_niche | purchase_device | is_origin_page_social_network
purchase_id | 1.000 | 0.209 | 0.111 | 0.112 | 0.301 | -0.014 | -0.018 | 0.017 | 0.041 | 0.027 | 0.016
product_id | 0.209 | 1.000 | 0.406 | 0.466 | 0.155 | -0.006 | -0.178 | 0.075 | 0.244 | 0.138 | 0.084
affiliate_id | 0.111 | 0.406 | 1.000 | 0.854 | 0.181 | -0.056 | 0.040 | 0.062 | 0.225 | 0.072 | 0.088
producer_id | 0.112 | 0.466 | 0.854 | 1.000 | 0.174 | -0.041 | -0.070 | 0.113 | 0.244 | 0.086 | 0.093
buyer_id | 0.301 | 0.155 | 0.181 | 0.174 | 1.000 | 0.054 | 0.067 | 0.029 | 0.065 | 0.159 | 0.017
purchase_value | -0.014 | -0.006 | -0.056 | -0.041 | 0.054 | 1.000 | 0.064 | 0.005 | 0.008 | 0.005 | 0.000
affiliate_commission_percentual | -0.018 | -0.178 | 0.040 | -0.070 | 0.067 | 0.064 | 1.000 | 0.057 | 0.149 | 0.115 | 0.025
product_category | 0.017 | 0.075 | 0.062 | 0.113 | 0.029 | 0.005 | 0.057 | 1.000 | 0.149 | 0.119 | 0.071
product_niche | 0.041 | 0.244 | 0.225 | 0.244 | 0.065 | 0.008 | 0.149 | 0.149 | 1.000 | 0.192 | 0.072
purchase_device | 0.027 | 0.138 | 0.072 | 0.086 | 0.159 | 0.005 | 0.115 | 0.119 | 0.192 | 1.000 | 0.091
is_origin_page_social_network | 0.016 | 0.084 | 0.088 | 0.093 | 0.017 | 0.000 | 0.025 | 0.071 | 0.072 | 0.091 | 1.000
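
The only pair flagged in the Alerts section is affiliate_id and producer_id. A simple way to recover that from the matrix is to scan the upper triangle for coefficients at or above a cutoff. In the sketch below the cutoff of 0.5 is an assumption chosen for illustration (it separates 0.854 from the next highest value, 0.466); the report does not state the threshold it used. Only the five ID columns are included to keep the example short.

```python
# Subset of the correlation matrix above (ID columns only).
columns = ["purchase_id", "product_id", "affiliate_id", "producer_id", "buyer_id"]
matrix = [
    [1.000, 0.209, 0.111, 0.112, 0.301],
    [0.209, 1.000, 0.406, 0.466, 0.155],
    [0.111, 0.406, 1.000, 0.854, 0.181],
    [0.112, 0.466, 0.854, 1.000, 0.174],
    [0.301, 0.155, 0.181, 0.174, 1.000],
]


def high_correlation_pairs(columns, matrix, threshold=0.5):
    """Return (col_a, col_b, r) for each pair above the diagonal with |r| >= threshold."""
    pairs = []
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs


flagged = high_correlation_pairs(columns, matrix)
print(flagged)
```

Note that these are numeric identifiers, so a high coefficient here most likely reflects that affiliates and producers are often the same account rather than a meaningful numeric relationship.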

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | purchase_id | product_id | affiliate_id | producer_id | buyer_id | purchase_date | product_creation_date | product_category | product_niche | purchase_value | affiliate_commission_percentual | purchase_device | purchase_origin | is_origin_page_social_network | Venda
01663958664020937211623812003972016-06-26 12:00:002011-03-19 15:47:36VideoPresentation skills-0.265302NaNSmart TVOrigin ef2b0,01
116770872350141418282110837642016-06-26 12:00:002010-07-05 01:50:15PodcastChild psychology-0.177077NaNSmart TVOrigin ef2b0,01
220173603566961864261864214361062016-06-26 12:00:002012-06-13 02:59:37PodcastPresentation skills-0.468989NaNSmart TVOrigin ef2b0,01
320173795799811645117038814361182016-06-26 12:00:002013-05-07 08:51:31PodcastAnxiety management-0.401168NaNSmart TVOrigin ef2b0,01
4201738258329126148822125313863572016-06-26 12:00:002013-05-12 08:12:06PodcastTeaching English-0.452489NaNSmart TVOrigin ef2b0,01
520173871788935480519298212744232016-06-26 12:00:002011-12-18 09:31:54Phisical bookOnline course creation-0.503716NaNSmart TVOrigin ef2b0,01
620174423646818689718689712523242016-06-26 12:00:002012-06-25 19:29:42PodcastPresentation skills-0.472931NaNSmart TVOrigin ef2b0,01
7201752253664916959169514361872016-06-26 12:00:002013-03-02 13:47:47Phisical bookAnxiety management-0.288390NaNSmart TVOrigin ef2b0,01
8201756949928863558635546042016-06-26 12:00:002013-01-05 12:18:22PodcastMedia training-0.528100NaNSmart TVOrigin ef2b0,01
920175713725061642661642614361992016-06-26 12:00:002012-07-03 18:02:17PodcastStorytelling-0.503697NaNSmart TVOrigin ef2b0,01
(index) | purchase_id | product_id | affiliate_id | producer_id | buyer_id | purchase_date | product_creation_date | product_category | product_niche | purchase_value | affiliate_commission_percentual | purchase_device | purchase_origin | is_origin_page_social_network | Venda
1599818140119882007363227112322711277018732016-06-30 23:59:062016-01-27 18:35:10Phisical bookNegotiation-0.4308270.0DesktopOrigin 65db1,01
1599819140119911829283475036203893277018752016-06-30 23:59:312015-10-31 21:40:14Phisical bookAnxiety management0.17206940.0eReadersOrigin adf00,01
15998201401199223470833458033458077018762016-06-30 23:59:342016-06-04 21:01:10WorkshopPersonal finance0.1983490.0eReadersOrigin a5c60,01
15998211401199385984339803460277018772016-06-30 23:59:342014-02-17 14:15:23Phisical bookAnxiety management0.74271750.0DesktopOrigin 7fd60,01
1599822140119944645133218537690877018482016-06-30 23:59:452012-10-30 07:35:14PodcastPresentation skills-0.48511450.0eReadersOrigin 54fd0,01
1599823140119952383627586641758664157361722016-06-30 23:59:572016-06-16 12:10:46Phisical bookPersonal finance-0.3453610.0eReadersOrigin 30220,01
159982414012431612795890225890229460672016-06-30 21:40:112013-06-15 16:41:06Phisical bookPersonal finance-0.4717860.0Smart TVOrigin ef2b0,01
1599825143439962152421186145118614564731722016-05-13 16:45:422016-03-26 17:59:47Phisical bookNegotiation-0.3591580.0Smart TVOrigin ef2b0,01
1599826143441132152421186145118614564731722016-06-22 14:39:052016-03-26 17:59:47Phisical bookNegotiation-0.3591580.0Smart TVOrigin ef2b0,01
1599827143572032152421186145118614564731722016-04-11 19:37:252016-03-26 17:59:47Phisical bookNegotiation-0.3591580.0TabletOrigin 3fcc0,01